2,402 research outputs found

    Reply From the Authors


    Predicting deleterious nsSNPs: an analysis of sequence and structural attributes

    BACKGROUND: There has been an explosion in the number of single nucleotide polymorphisms (SNPs) within public databases. In this study we focused on non-synonymous protein coding single nucleotide polymorphisms (nsSNPs), some associated with disease and others thought to be neutral. We describe the distribution of both types of nsSNPs using structural and sequence based features and assess the relative value of these attributes as predictors of function using machine learning methods. We also address the common problem of balance within machine learning methods and show the effect of imbalance on nsSNP function prediction. We show that nsSNP function prediction can be significantly improved by 100% undersampling of the majority class. The learnt rules were then applied to make predictions of function on all nsSNPs within Ensembl. RESULTS: The measure of prediction success is greatly affected by the level of imbalance in the training dataset. We found the balanced dataset that included all attributes produced the best prediction. The performance as measured by the Matthews correlation coefficient (MCC) varied between 0.49 and 0.25 depending on the imbalance. As previously observed, the degree of sequence conservation at the nsSNP position is the single most useful attribute. In addition to conservation, structural predictions made using a balanced dataset can be of value. CONCLUSION: The predictions for all nsSNPs within Ensembl, based on a balanced dataset using all attributes, are available as a DAS annotation. Instructions for adding the track to Ensembl are available.
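
    The balancing strategy described above can be sketched in a few lines: undersample the majority (neutral) class down to the size of the minority (disease) class, then score the classifier with the Matthews correlation coefficient. This is a minimal illustration on synthetic data; the dataset, classifier choice and all parameters below are stand-ins, not the paper's.

```python
# Sketch of "100% undersampling of the majority class" before training,
# evaluated with the Matthews correlation coefficient (MCC).
# Synthetic data only; nothing here reproduces the paper's nsSNP features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Imbalanced toy problem: ~90% "neutral" (0) vs ~10% "disease" (1)
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Undersample: keep every minority example, draw an equal-sized
# random subset of the majority class (without replacement).
minority = np.flatnonzero(y_tr == 1)
majority = rng.choice(np.flatnonzero(y_tr == 0),
                      size=minority.size, replace=False)
idx = np.concatenate([minority, majority])

# Train on the balanced subset, evaluate on the untouched test split
clf = RandomForestClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
mcc = matthews_corrcoef(y_te, clf.predict(X_te))
```

    MCC ranges from -1 to 1 and, unlike accuracy, is not inflated by a dominant majority class, which is why it suits imbalanced problems like this one.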

    Safety Implications of High-Field MRI: Actuation of Endogenous Magnetic Iron Oxides in the Human Body

    Background: Magnetic Resonance Imaging scanners have become ubiquitous in hospitals, and high-field systems (greater than 3 Tesla) are becoming increasingly common. In light of recent European Union moves to limit high-field exposure for those working with MRI scanners, we have evaluated the potential for detrimental cellular effects via nanomagnetic actuation of endogenous iron oxides in the body. Methodology: Theoretical models and experimental data on the composition and magnetic properties of endogenous iron oxides in human tissue were used to analyze the forces on iron oxide particles. Principal Findings and Conclusions: Results show that, even at 9.4 Tesla, forces on these particles are unlikely to disrupt normal cellular function via nanomagnetic actuation.
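
    Force analyses of this kind typically rest on the standard magnetostatic gradient-force expression for a weakly magnetic particle (a textbook relation, stated here for orientation rather than taken from the paper):

```latex
\mathbf{F} = \frac{\chi V}{\mu_0}\,(\mathbf{B}\cdot\nabla)\mathbf{B},
\qquad
|\mathbf{F}| \approx \frac{\chi V}{\mu_0}\, B\,\frac{\mathrm{d}B}{\mathrm{d}z}
```

    where \(\chi\) is the particle's volume magnetic susceptibility, \(V\) its volume, \(B\) the flux density and \(\mu_0\) the vacuum permeability. Because the force scales with particle volume, nanometre-scale endogenous particles experience forces that remain tiny even at 9.4 Tesla, consistent with the conclusion above.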

    Native-state stability determines the extent of degradation relative to secretion of protein variants from Pichia pastoris.

    We have investigated the relationship between the stability and secreted yield of a series of mutational variants of human lysozyme (HuL) in Pichia pastoris. We show that genes directly involved in the unfolded protein response (UPR), ER-associated degradation (ERAD) and ER-phagy are transcriptionally up-regulated more quickly and to higher levels in response to expression of more highly-destabilised HuL variants, and that those variants are secreted at lower yield. We also show that the less stable variants are retained within the cell and may also be targeted for degradation. To explore the relationship between stability and secretion further, two different single-chain-variable-fragment (scFv) antibodies were also expressed in P. pastoris, but only one of the scFvs gave rise to secreted protein. The non-secreted scFv was detected within the cell and the UPR indicators were pronounced, as they were for the poorly-secreted HuL variants. The non-secreted scFv was modified by changing either the framework regions or the linker to improve the predicted stability of the scFv; secretion was then achieved and the levels of UPR indicators were lowered. Our data support the hypothesis that less stable proteins are targeted for degradation over secretion and that this accounts for the decrease in the yields observed. We discuss the secretion of proteins in relation to lysozyme amyloidosis, in particular, and optimised protein secretion, in general.

    Knowledge graph prediction of unknown adverse drug reactions and validation in electronic health records

    Unknown adverse reactions to drugs available on the market present a significant health risk and limit accurate judgement of the cost/benefit trade-off for medications. Machine learning has the potential to predict unknown adverse reactions from current knowledge. We constructed a knowledge graph containing four types of node: drugs, protein targets, indications and adverse reactions. Using this graph, we developed a machine learning algorithm based on a simple enrichment test and first demonstrated that this method performs extremely well at classifying known causes of adverse reactions (AUC 0.92). A cross-validation scheme in which 10% of drug-adverse reaction edges were systematically deleted per fold showed that the method correctly predicts 68% of the deleted edges on average. Next, a subset of adverse reactions that could be reliably detected in anonymised electronic health records from South London and Maudsley NHS Foundation Trust was used to validate predictions from the model that are not currently known in public databases. High-confidence predictions were validated in electronic records significantly more frequently than random models, and outperformed standard methods (logistic regression, decision trees and support vector machines). This approach has the potential to improve patient safety by predicting adverse reactions that were not observed during randomised trials.
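
    One way an enrichment test over such a graph can work is to score a candidate drug-adverse-reaction pair by asking whether the drug's protein targets are over-represented among the targets of drugs already linked to that reaction. The toy graph and hypergeometric scoring below are an illustrative assumption, not the paper's actual algorithm or data.

```python
# Hedged sketch of an enrichment test for a candidate drug-ADR pair:
# is the overlap between the drug's targets and the ADR-associated
# targets larger than chance? All node names here are invented.
from scipy.stats import hypergeom

all_targets = {f"P{i}" for i in range(50)}    # protein-target universe
adr_targets = {f"P{i}" for i in range(10)}    # targets of drugs known to cause the ADR
drug_targets = {"P1", "P2", "P3", "P40"}      # targets of the candidate drug

overlap = len(drug_targets & adr_targets)

# P(overlap >= observed) if the drug's targets were drawn at random
# from the universe: hypergeom.sf(k-1, M, n, N)
p_value = hypergeom.sf(overlap - 1, len(all_targets),
                       len(adr_targets), len(drug_targets))
```

    A small p-value flags the drug-ADR edge as a plausible prediction; ranking candidate edges by this score gives the kind of high-confidence list that could then be checked against health records.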

    Innovative Test Operations to Support Orion and Future Human Rated Missions

    This paper describes how the Orion program is implementing new and innovative test approaches and strategies in an evolving development environment. The early flight test spacecraft are evolving in design maturity and complexity, requiring significant changes in the ground test operations for each mission. The testing approach for EM-2 is planned to validate innovative Orion production acceptance testing methods to support human exploration missions in the future. Manufacturing and testing at Kennedy Space Center in the Neil Armstrong Operations and Checkout facility will provide a seamless transition directly to the launch site, avoiding transportation and checkout of the spacecraft from other locations.

    AI chatbots not yet ready for clinical use

    As large language models (LLMs) expand and become more advanced, so do the natural language processing capabilities of conversational AI, or "chatbots". OpenAI's recent release, ChatGPT, uses a transformer-based model to enable human-like text generation and question-answering on general domain knowledge, while a healthcare-specific LLM such as GatorTron focuses on real-world healthcare domain knowledge. As LLMs advance to achieve near human-level performance on medical question-answering benchmarks, it is probable that conversational AI will soon be developed for use in healthcare. In this article we discuss the potential of, and compare the performance of, two different approaches to generative pretrained transformers: ChatGPT, the most widely used general conversational LLM, and Foresight, a GPT-based model focused on modelling patients and disorders. The comparison is conducted on the task of forecasting relevant diagnoses based on clinical vignettes. We also discuss important considerations and limitations of transformer-based chatbots for clinical use.

    The psycho-ENV corpus: Research articles annotated for knowledge discovery on correlating mental diseases and environmental factors

    While the published scientific literature is used in biomedical contexts such as building gene networks for disease gene discovery, it appears to be an undervalued resource with respect to mental illnesses: it has rarely been explored for the purpose of gaining psychopathology insights. This limits our capability to better understand the underlying mechanisms of mental disorders. In this paper we describe the psycho-ENV corpus, which aims to annotate published studies to facilitate knowledge discovery on the pathologies of mental diseases. Specifically, this corpus focuses on the correlations between mental diseases and environmental factors. We report the first preliminary work on psycho-ENV, annotating 20 articles about two mental illnesses (bipolar disorder and depression) and two particular environmental factors: light and sunlight. The corpus is available at https://github.com/KHP-Informatics/psycho-env

    A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies

    Background: Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply better performance on average or at the population level, and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. Methods: We compare the classification performance of a number of important and widely used machine learning algorithms, namely Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. Results: For a smaller number of correlated features (the number of features not exceeding approximately half the sample size), LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows, and surpasses that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN, in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study
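
    A miniature version of this comparison design, reduced to a single simulated dataset and one factor setting, can be sketched as below. The simulation settings and the held-out-split error estimate are placeholders for illustration, not the paper's factorial design or error-estimation procedure.

```python
# Compare held-out classification error of the four method families
# named above (LDA, RBF-SVM, kNN, RF) on one simulated dataset.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# One point in the design space: 400 samples, 20 features, 5 informative
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(kernel="rbf"),
    "kNN": KNeighborsClassifier(),
    "RF":  RandomForestClassifier(random_state=0),
}

# Generalisation error estimated as 1 - accuracy on the held-out split
errors = {name: 1 - m.fit(X_tr, y_tr).score(X_te, y_te)
          for name, m in models.items()}
```

    The full study would wrap this in a grid over feature count, sample size, variability and effect size, with many replications per cell, which is what makes the supercomputing resources necessary.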